Protein Tau’s Role in Gene Expression

Group 25: Ana Paula Rodriguez (s232119), Jacqueline Printz (s194377), Jenni Kinnunen (s204697), João Prazeres (s243036), William Gunns (s242051)

1. Introduction - Protein Tau

  • Function: Microtubule-associated protein essential for cytoskeletal stability and neuronal transport.

  • Supports healthy neuronal functions.

  • Destabilization is linked to neuronal dysfunction and Alzheimer’s disease.

  • Previous studies concluded that Tau destabilization led to an alteration in the expression of glutamatergic genes.

Experimental Objective:

Is the overexpression of Tau associated with alterations in gene expression?

Tau Protein Diagram

2. Experimental Setup

Differential gene expression analysis of RNA-seq data performed on:

  • Control: 3 samples of SH-SY5Y cells with overexpression of a control vector.
  • Experimental Condition: 3 samples of SH-SY5Y cells with overexpression of Tau 0N4R isoform.

RNA-seq data were reported in 3 xls sheets:

  • Read Counts.
  • RPM (Reads Per Million mapped reads).
  • RPKM (Reads Per Kilobase of transcript per Million mapped reads).

The 3 sheets were joined into one large tibble data frame.

# A tibble: 58,395 × 9
   ...1     GeneName description SH_ctrl_1 SH_ctrl_2 SH_ctrl_3 SH_tau_1 SH_tau_2
   <chr>    <chr>    <chr>           <dbl>     <dbl>     <dbl>    <dbl>    <dbl>
 1 ENSG000… TSPAN6   tetraspani…       319       582       280      214      189
 2 ENSG000… TNMD     tenomoduli…         0         0         0        0        0
 3 ENSG000… DPM1     dolichyl-p…       792      1556       781      521      502
 4 ENSG000… SCYL3    SCY1 like …       517       561       445      323      365
 5 ENSG000… C1orf112 chromosome…       533       537       566      601      584
 6 ENSG000… FGR      FGR proto-…         0         0         0        2        2
 7 ENSG000… CFH      complement…         2         0         1        0        0
 8 ENSG000… FUCA2    alpha-L-fu…       487       761       447      341      321
 9 ENSG000… GCLC     glutamate-…       430       703       246      233      218
10 ENSG000… NFYA     nuclear tr…      1101      1156       760      898      583
# ℹ 58,385 more rows
# ℹ 1 more variable: SH_tau_3 <dbl>

3. Data Wrangling

The data were first cleaned and prepared by:

1. Joining three RNA sequencing data sheets into one.

2. Renaming columns and naming unnamed columns.

3. Removing unnecessary and invalid observations.

4. Saving the description data (gene_Ensmebl, gene_ID, gene_decription) in a metadata file.

Tidyverse functions used:

full_join: to merge the three sheets into one data frame

mutate: to add new columns to the dataset

select: to keep or drop columns

filter: to keep or drop rows
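The deck's own pipeline was written in R with the tidyverse. As a rough plain-Python sketch of the same idea, the snippet below performs a full outer join of three sheets on the gene ID and then drops genes with no signal at all. The function names `full_join` and `keep_expressed`, the toy gene IDs and values, and the "drop all-zero genes" rule are illustrative assumptions, not the authors' code or their exact filtering criteria.

```python
def full_join(sheets):
    """Full outer join of several sheets on gene ID.

    `sheets` maps a suffix (e.g. "reads") to {gene_id: {sample: value}}.
    Returns {gene_id: {"<sample>_<suffix>": value}}; a gene that is
    missing from one sheet gets None in that sheet's columns.
    """
    all_genes = set()
    for sheet in sheets.values():
        all_genes.update(sheet)

    joined = {}
    for gene in sorted(all_genes):
        row = {}
        for suffix, sheet in sheets.items():
            samples = {s for values in sheet.values() for s in values}
            for sample in sorted(samples):
                row[f"{sample}_{suffix}"] = sheet.get(gene, {}).get(sample)
        joined[gene] = row
    return joined


def keep_expressed(table):
    """Assumed filter rule: keep genes with at least one non-zero value."""
    return {
        gene: row
        for gene, row in table.items()
        if any(value not in (None, 0) for value in row.values())
    }


# Toy stand-ins for the three sheets (illustrative values, not the real data).
sheets = {
    "reads": {"gene_a": {"SH_ctrl_1": 319, "SH_tau_1": 214},
              "gene_b": {"SH_ctrl_1": 0, "SH_tau_1": 0}},
    "rpm": {"gene_a": {"SH_ctrl_1": 12.5, "SH_tau_1": 8.0}},
    "rpkm": {"gene_a": {"SH_ctrl_1": 4.1, "SH_tau_1": 2.6},
             "gene_b": {"SH_ctrl_1": 0.0, "SH_tau_1": 0.0}},
}

joined = full_join(sheets)
expressed = keep_expressed(joined)
```

Suffixing each column with its sheet name keeps the read-count, RPM, and RPKM values of one sample distinguishable after the merge.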

4. Data Augmentation

Normalized Data

  • Log transformation applied to selected columns.
  • A small pseudocount (0.0001) was added before the transformation to avoid taking the log of zero.
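A minimal plain-Python sketch of this normalization step follows (the deck's own pipeline used R). The pseudocount of 0.0001 comes from the slide; the base-2 logarithm and the function name `log_transform` are assumptions, since the base is not stated.

```python
import math

PSEUDOCOUNT = 0.0001  # value stated on the slide


def log_transform(values, pseudocount=PSEUDOCOUNT):
    """Return log2(value + pseudocount) for each value.

    Base 2 is an assumption; the slide only says "log transformation".
    """
    return [math.log2(value + pseudocount) for value in values]


# Zero counts become a large negative number instead of -infinity,
# and any value below 1 (common for RPM/RPKM) maps to a negative log.
transformed = log_transform([0, 0.5, 2, 319])
```

With such a small pseudocount, zeros end up far from all other values on the log scale, which is worth keeping in mind when reading density plots or PCA results.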

Filtering By Statistical Significance

  • Standard deviation calculated for each gene to filter data.
  • Genes with high SD across replicates were discarded.

Calculated Mean

  • Mean values were calculated for control and tau groups.
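The SD filter and group means can be sketched in plain Python as below (the deck's own pipeline used R). The slide does not give the SD cutoff, so `max_sd=0.5` is a hypothetical threshold, and the function names and toy log-scale values are likewise illustrative.

```python
import statistics


def summarise(ctrl, tau):
    """Per-gene mean and sample standard deviation for each group."""
    return {
        "ctrl_mean": statistics.mean(ctrl),
        "tau_mean": statistics.mean(tau),
        "ctrl_sd": statistics.stdev(ctrl),
        "tau_sd": statistics.stdev(tau),
    }


def passes_sd_filter(summary, max_sd):
    """Keep a gene only if its replicates agree within both groups.

    `max_sd` is a hypothetical cutoff; the slide does not state one.
    """
    return summary["ctrl_sd"] <= max_sd and summary["tau_sd"] <= max_sd


# Toy log-scale values for three replicates per group (not real data).
consistent = summarise([8.0, 8.1, 7.9], [6.0, 6.2, 5.8])
noisy = summarise([2.0, 9.0, 5.0], [6.0, 6.2, 5.8])

kept = [name for name, s in [("consistent", consistent), ("noisy", noisy)]
        if passes_sd_filter(s, max_sd=0.5)]
```

Requiring a low SD within each group separately avoids discarding genes simply because control and tau differ from each other, which is the signal of interest.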

Log2 Fold Change Filtering

  • Calculated log2 fold change between control and tau groups.
  • Genes with an absolute log2 fold change greater than 1 (log2FC > 1 or < -1, i.e. at least a two-fold change) were retained.
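As a small plain-Python sketch of this step (the deck's own pipeline used R), the log2 fold change is computed here as tau over control from linear-scale group means; on log2-transformed means the same quantity is simply the difference of the two means. The direction of the ratio, the reuse of the 0.0001 pseudocount, and the function names are assumptions.

```python
import math


def log2_fold_change(ctrl_mean, tau_mean, pseudocount=0.0001):
    """log2(tau / control); the pseudocount guards against division by zero."""
    return math.log2((tau_mean + pseudocount) / (ctrl_mean + pseudocount))


def is_retained(lfc, threshold=1.0):
    """Keep genes whose expression at least doubles or halves."""
    return abs(lfc) > threshold


# Toy group means (not real data): 4-fold up, 1.5-fold up, 4-fold down.
examples = {"up": (100, 400), "mild": (100, 150), "down": (400, 100)}
retained = [name for name, (ctrl, tau) in examples.items()
            if is_retained(log2_fold_change(ctrl, tau))]
```

A threshold of 1 on the log2 scale corresponds to a two-fold change in either direction, so up- and down-regulated genes are treated symmetrically.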

Final Data

  • Data stored in three separate files for analysis.

5. Data Description part 1

  • Filtered_data (long format): significant genes, all 3 replicates
  • All_data_means: all genes, mean of replicates
  • Log_data: significant genes, mean of replicates and fold change values (for PCA)
# A tibble: 3 × 4
  Data                Rows Columns Genes
  <chr>              <dbl>   <dbl> <dbl>
1 Filtered data long 36648       3  2036
2 All data means     19279      10 19279
3 Log data            2036      10  2036

6. Data Description part 2

  • Genes with higher baseline expression appear to be more affected by the increase in Tau protein.
  • The x axis includes negative values in the RPM and RPKM plots because the log transform of a value below 1 is negative. The read-count plot does not show this effect because the counts are integers.

7. Analysis PCA

Objective

  • Confirm that RPM, RPKM, and reads yield similar results.
  • Verify differences between control and tau experiments.

Approach

  • Library used: broom
  • 3 separate PCAs performed, one each for the RPM, RPKM, and read-count data.
  • 1 final PCA conducted on combined data.
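To make the idea behind the PCA concrete, here is a self-contained plain-Python sketch that computes only the first principal component scores of a samples-by-genes matrix by power iteration on the centered Gram matrix. This is not the authors' method (the deck used R with broom); the function name `pc1_scores` and the toy matrix are illustrative assumptions.

```python
import math


def pc1_scores(matrix, iterations=200):
    """Scores of each sample (row) on the first principal component.

    Columns (genes) are mean-centered, then the leading eigenvector of
    the samples-by-samples Gram matrix is found by power iteration.
    The overall sign of the scores is arbitrary, as in any PCA.
    """
    n, p = len(matrix), len(matrix[0])
    col_means = [sum(row[j] for row in matrix) / n for j in range(p)]
    centered = [[row[j] - col_means[j] for j in range(p)] for row in matrix]
    gram = [[sum(a * b for a, b in zip(centered[i], centered[k]))
             for k in range(n)] for i in range(n)]

    vec = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        nxt = [sum(gram[i][k] * vec[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]

    eigenvalue = sum(vec[i] * sum(gram[i][k] * vec[k] for k in range(n))
                     for i in range(n))
    return [math.sqrt(eigenvalue) * v for v in vec]


# Toy log-scale matrix: 3 control rows, then 3 tau rows; 4 genes (not real data).
samples = [
    [8.0, 5.0, 3.0, 6.0],
    [8.2, 5.1, 2.9, 6.1],
    [7.9, 4.9, 3.1, 5.9],
    [6.0, 7.0, 3.0, 6.0],
    [6.1, 7.2, 3.1, 6.1],
    [5.9, 6.9, 2.9, 5.9],
]
scores = pc1_scores(samples)
ctrl_scores, tau_scores = scores[:3], scores[3:]
```

In this toy example the control and tau samples fall on opposite sides of zero along the first component, which is the kind of separation the PCA plots are meant to reveal.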

Results

  • Plots of individual PCAs show how each data type clusters.
  • Final PCA confirms global differences between control and tau groups.

8. Analysis PCA


9. Gene Set Enrichment Analysis

Differential Gene Expression in Tau Overexpression

Computational method to determine whether a predefined set of genes shows statistically significant expression differences between the control and Tau-overexpressing conditions.
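The core of the method can be illustrated with a simplified, unweighted running-sum enrichment score in plain Python. This is a sketch of the general idea only: the deck does not say which tool or weighting was used, the significance test (usually a permutation test) is omitted, and the function name and gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    Walk down the ranked list (e.g. sorted by log2 fold change), stepping
    up at genes in the set and down at all others; return the running-sum
    value that deviates most from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partly")

    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1 / n_hits if is_hit else -1 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking from most up-regulated to most down-regulated.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]
top_heavy = enrichment_score(ranked, {"G1", "G2"})
bottom_heavy = enrichment_score(ranked, {"G5", "G6"})
```

A positive score means the set is concentrated among the over-expressed genes and a negative score means it is concentrated among the under-expressed genes.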

Gene Expression Plot

Over- and under-expressed genes

Pathway Enrichment Plot

Over- and under-expressed pathways

10. Discussion and Conclusion

Findings

  • Alpha and gamma interferons: cytokines that regulate the immune response and trigger the JAK-STAT pathway.

  • JAK-STAT3 Pathway Activation:

    • Involved in immunity, cell division, and other cellular processes.
    • Neuroinflammation initiation through immune mechanisms.
    • Association of Tau overexpression with Alzheimer’s disease via JAK-STAT3 pathway activation.
    • Source

Conclusion

The results support an association between Tau overexpression and neurodegenerative disorders such as Alzheimer’s disease.